Abbreviations:

Hcu  Halobacterium cutirubrum
Hha  Halobacterium halobium (same species as Hcu)
Hvo  Halobacterium volcanii
Sac  Sulfolobus acidocaldarius
Eco  Escherichia coli


Research  Objectives 


(i)  to  characterize  the  principles  of  gene  organization  and  regulation  of 
gene  expression  in  archaebacteria;  (ii)  to  elucidate  the  evolutionary 
relationship  between  these  novel  organisms  and  the  traditional  eubacterial  and 
eucaryotic  organisms;  (iii)  to  understand  in  biophysical  and  molecular  terms 
some  of  the  mechanisms  that  allow  archaebacteria  to  inhabit  extreme 
environments. 

Progress  -  Year  1 

Ribosomal protein genes: The ribosomal A protein complex forms the stalk
structure on the large ribosomal subunit and is composed of four copies (two
dimers) of L12e and one copy of the L10e ribosomal protein. The L10e protein
binds to ribosomal RNA either directly or through the L11e protein, which
forms a bulge at the base of the stalk. This domain on the large subunit is
the site of factor binding and associated GTPase activities during the
protein synthesis cycle and is a conserved and defined feature of the
ribosome in all organisms (1). In Eco the genes encoding the L11, L1, L10
and L12 ribosomal proteins are located within a 3.0 Kb region of genomic DNA.
This region has been cloned, sequenced and extensively characterized (2,3).

Partial amino acid sequences of many archaebacterial ribosomal proteins
have been determined (4). We used the sequences from the Hcu L11e and L12e
proteins to deduce and synthesize oligonucleotide sequences complementary to
the corresponding genes. We initially cloned a 1.2 Kb Pst-Bam fragment
encoding L12e and subsequently a 5.2 Kb Cla-Bam fragment containing both L11e
and L12e. The sequence of the entire 5.2 Kb fragment was determined and
found to contain several long open reading frames. Four of these open
reading frames were shown to encode the L11e, L1e, L10e and L12e ribosomal
proteins. The gene order present in Eco is conserved in Hcu, although the
transcriptional organization will certainly be much different (work in
progress).
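
Deducing an oligonucleotide from a partial amino acid sequence amounts to
choosing the stretch of residues with the fewest possible codon combinations.
The Python sketch below illustrates that idea under the standard genetic
code. The function names and the example peptide are hypothetical, and the
sketch is not a description of the probes actually synthesized here.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Map each amino acid (one-letter code) to the codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(CODONS, AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)


def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide.upper())


def best_window(peptide, width):
    """Return (start, sub-peptide, degeneracy) for the least degenerate
    window of `width` residues; ties go to the earliest window."""
    if width > len(peptide):
        raise ValueError("window is longer than the peptide")
    windows = [
        (degeneracy(peptide[i:i + width]), i)
        for i in range(len(peptide) - width + 1)
    ]
    deg, start = min(windows)
    return start, peptide[start:start + width], deg


if __name__ == "__main__":
    # Hypothetical six-residue peptide, used only to show the call.
    print(best_window("LSRMWK", 3))
```

Windows rich in Met and Trp (one codon each) give the least degenerate
probes, whereas Leu, Ser and Arg (six codons each) are best avoided.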

Using  a  similar  strategy  the  corresponding  genomic  region  of  Sac  has  been 
cloned  and  sequenced.  It  contains  the  same  four  genes  in  the  same  order. 
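
Locating long open reading frames in a sequenced fragment, as was done for
the Hcu and Sac clones, can be sketched as a scan of all six reading frames.
The Python example below is a minimal illustration only: it assumes ATG is
the sole initiation codon, uses an arbitrary minimum length, and its function
names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (ACGT only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Scan both strands and all three frames for ATG...stop ORFs.

    Returns tuples (strand, start, end, n_codons). Coordinates are 0-based,
    end-exclusive, and for the "-" strand they refer to the reverse
    complement, not to the input sequence. n_codons counts the ATG but not
    the stop codon. Assumption: only ATG is treated as a start codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    # Toy sequence: ATG, five GCT codons, then a TAA stop.
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    print(find_orfs(toy, min_codons=3))
```

Candidate frames found this way would still have to be matched against the
protein sequence data, as was done for the four ribosomal protein genes.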

From the Hcu and Sac nucleic acid sequences we have deduced the amino acid
sequences of the four ribosomal proteins and have begun to align the proteins
with their eubacterial and eucaryotic equivalents and to deduce their
evolutionary relationships. This analysis has been completed for the L10e and
L12e proteins (5).

There are presently four complete sequences of L10e proteins (one
eubacterial, two archaebacterial and one eucaryotic) and 16 complete sequences
of L12e (eight eubacterial, three archaebacterial and five eucaryotic). Using
these sequences we have carried out inter-kingdom L10e and L12e alignments.

The L10e proteins from the three kingdoms were found to be co-linear. (A cDNA
sequence published by Rich and Steitz was shown by us to encode the human L10e
protein (6).) The eubacterial protein is much shorter than the
archaebacterial and eucaryotic proteins because of two large deletions, one
internal and one at the carboxy terminus. The two archaebacterial L10e
proteins were the most similar; they exhibit 27% amino acid identity with only
a single deletion-insertion. Inter-kingdom comparisons exhibit 15-25% amino
acid identity with 5-7 deletion-insertions.
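
The identity and deletion-insertion figures quoted above are the kind of
summary that can be computed from any gapped pairwise alignment. The Python
sketch below shows one way to do so. It assumes that identity is expressed
over columns in which both sequences have a residue, and that each
uninterrupted run of gap columns counts as one deletion-insertion; the report
does not state which convention was used, and the function name and example
strings are hypothetical.

```python
def alignment_stats(a, b, gap="-"):
    """Return (percent identity, number of deletion-insertion events)
    for two pre-aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0   # columns with the same residue in both sequences
    compared = 0    # columns with a residue in both sequences
    indels = 0      # uninterrupted runs of gapped columns
    in_gap = False
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        if x == gap or y == gap:
            if not in_gap:
                indels += 1  # a new gap run starts here
            in_gap = True
            continue
        in_gap = False
        compared += 1
        if x == y:
            identical += 1
    percent = 100 * identical / compared if compared else 0.0
    return percent, indels


if __name__ == "__main__":
    # Hypothetical aligned fragments, used only to show the call.
    print(alignment_stats("MKT-AL", "MRTGAL"))
```

Dividing by the full alignment length instead of by ungapped columns would
give lower identities, so the convention should be stated with any figure.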


The L12e proteins from archaebacteria and eucaryotes can also be aligned
and are co-linear. The eubacterial protein could not be made to fit this
pattern, although two regions appear to have homologous domains in the
archaebacterial and eucaryotic proteins. The C-terminal domain of the Eco
L12e protein has been shown to contain (i) a conserved face for interaction
with extrinsic translation factors; (ii) an anion (potential GTP) binding
site; and (iii) a putative L12e-L12e dimerization site (1). This domain is
separated from the N-terminus by an unstructured ala-pro-rich hinge sequence
and aligns to a region near the amino terminus of the archaebacterial and
eucaryotic proteins. Inter-kingdom comparisons show 18-36 percent amino
acid sequence identity over this domain. The second region conserved in all
proteins is the ala-pro-rich sequence. In eubacteria this sequence precedes
the factor interaction domain; in archaebacteria and eucaryotes it follows the
factor interaction domain. The N-terminus of the eubacterial L12e protein
exhibits more similarity to its own carboxy end than to any region within the
archaebacterial-eucaryotic protein. The carboxy termini of the
archaebacterial and eucaryotic proteins are highly similar to each other and
are apparently not represented in the eubacterial L12e protein.


Intraspecies comparisons between L10e and L12e proteins indicate that the
L10e protein of archaebacteria and eucaryotes contains a partial copy of the
L12e protein fused to its carboxy terminus. (In eubacteria most of this
fusion has been removed from L10e by the carboxy terminal deletion and the
L12e protein has been restructured.) For the two Sac proteins the 31 carboxy
terminal residues of L10e and L12e are identical; conservation at the
nucleotide level is also perfect. This homology extends further into the
central regions of the L10e and L12e proteins and includes a 26 amino acid
long module, reiterated thrice in archaebacterial L10e, twice in eucaryotic
L10e and once in the corresponding L12e protein. Careful examination
indicates that a single copy of the modular sequence is also present in the
L12e and L10e proteins of eubacteria. This modular sequence may play a role in
L12e dimerization, L10e-L12e complex formation and the function of the
L10e-L12e complex in translation.


Based upon these alignments and shared homologies a model has been
constructed to depict the evolution of the primordial L10e and L12e genes and
proteins in the three phylogenetic groups (5).


Superoxide dismutase: Previously we had purified and characterized the
SOD enzyme activity from Hcu (7). An N-terminal amino acid sequence was
determined, an oligonucleotide was synthesized and the gene encoding the SOD
enzyme was cloned as a 1.1 Kb Sau3A fragment and sequenced. The fragment
contains a single long open reading frame encoding a protein of 200 amino
acids. The first 56 codons of the ORF correspond to the amino acid sequence
determined from the purified protein. The gene encodes a typical Mn/Fe ion
type enzyme as evidenced by the fact that (i) it has an acidic pI; (ii) it
contains no cys residues; (iii) it is relatively abundant in tyr and trp
residues; (iv) it exhibits 41% amino acid identity with the Mn SOD of Bacillus
stearothermophilus; and (v) it conserves the four residues, his 29, his 76,
asp 158 and his 162, that in the Bacillus enzyme (positions 26, 81, 163 and
167, respectively) are utilized for Mn binding.

By S1 nuclease protection and primer extension analysis we have shown
that the SOD mRNA is about 242 nucleotides in length. It begins two
nucleotides in front of the ATG translation initiation codon and terminates
about 40 nucleotides downstream of the TAA translation termination codon, in a
T5 sequence that is preceded by a GC-rich sequence. This terminator
sequence appears similar to other Hcu terminators. Analysis of this and other
5' flanking sequences has yet to reveal the consensus sequence involved in
promoter recognition by RNA polymerase in Hcu (that is, there is little
conservation between the bacterio-opsin, SOD, rRNA and r-protein 5' flanking
sequences).
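
The terminator described above, a T5 run preceded by a GC-rich sequence,
suggests a simple pattern search for similar sites in other 3' flanking
regions. The Python sketch below is one possible illustration. The upstream
window of 10 nucleotides and the 70% GC cut-off are arbitrary assumptions, not
values taken from this work, and the function name and test sequence are
hypothetical.

```python
import re


def find_terminator_like(seq, window=10, min_gc=0.7):
    """Find runs of five or more T residues preceded by a GC-rich window.

    Returns tuples (start of T run, upstream window, T run). The defaults
    for `window` and `min_gc` are illustrative assumptions only.
    """
    seq = seq.upper()
    hits = []
    for match in re.finditer(r"T{5,}", seq):
        start = match.start()
        upstream = seq[max(0, start - window):start]
        if not upstream:
            continue  # T run at the very start: nothing upstream to score
        gc = sum(base in "GC" for base in upstream) / len(upstream)
        if gc >= min_gc:
            hits.append((start, upstream, match.group()))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence, used only to show the call.
    toy = "ATATAT" + "GCGCCGGCGC" + "TTTTT" + "AAAA"
    print(find_terminator_like(toy))
```

Hits from such a scan are only candidates; the transcript ends themselves
were mapped experimentally by S1 nuclease protection.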

Like bacterio-opsin, the mRNA encoding SOD starts only two nucleotides in
front of the ATG translation initiation codon. The SOD mRNA has no homolog of
the purine-rich Shine-Dalgarno sequence of eubacteria, which is complementary
to the 3' end of 16S rRNA. In contrast, some r-protein genes of Hcu
retain this sequence. Thus it seems that Hcu ribosomes can initiate
translation by two separate mechanisms: at the first AUG on the transcript
(SOD and bacterio-opsin) or by a Shine-Dalgarno interaction at internal AUG
codons.

We have discovered that the SOD gene in Hcu is subject to regulation.
Addition of paraquat, a generator of oxygen radicals, to an exponential phase
culture results in a 5- to 10-fold induction of SOD enzyme activity. After
several days of growth in the presence of paraquat the SOD activity returns to
normal. By then the culture has become resistant to the inhibitor, and the
frequency at which this occurs suggests that resistance arises by insertion of
a transposable element (as in bacterio-opsin negative mutants of Hha; 8) into
a gene specifying sensitivity (i.e. transport of paraquat across the cell
envelope). This observation is being investigated.

Objectives for Year 2

Genes encoding r-proteins

1)  Carry out inter-kingdom alignment and analysis of the L11e and L1e
proteins;

2)  Define the transcriptional organization of the L11e, L1e, L10e and L12e
gene clusters in Hcu and Sac, map 5' and 3' transcript ends, cap 5'
transcript ends and analyze for conserved promoter and terminator
sequences.

Superoxide  dismutase 

3)  Cap  mRNA  and  characterize  induction  by  paraquat  at  the  level  of 
transcription; 

4)  Look for proteins that bind DNA in the SOD promoter region by in vivo
footprinting;

5)  Clone and characterize the SOD gene from a genetically stable strain, Hvo,
and isolate paraquat-sensitive (null) and resistant (constitutive) mutants;

6)  Utilize the transformation system of Hvo to characterize the genetic
regulation of the SOD gene.

Processing  of  rRNA 

7)  Characterize  promoter  and  processing  signals  in  the  separate  16S-23S  and 
5S  transcription  units  of  Sac; 

8)  Utilize cloned copies of the 16S inverted repeat processing signal from
Hcu to develop an assay for RNase III-like activity;

9)  Attempt to purify the Hcu RNase III-like activity, clone the gene and study
substrate specificity by in vitro site-specific mutagenesis of the cloned 16S
inverted repeat sequence.

Publications 